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Abstract 

We present a new, biased Monte Carlo scheme for simulating complex, cyclic 
peptides. Backbone atoms are equilibrated with a biased rebridging scheme, 
and side-chain atoms are equilibrated with a look-ahead configurational bias 
Monte Carlo. Parallel tempering is shown to be an important ingredient in the 
construction of an efficient approach. 
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1 Introduction 



Peptides are of fundamental importance in biological systems. They regulate homeosta- 
sis, particularly thirst, feeding and pain pQ, serve as important signaling molecules in the 
nervous system and are used as a chemical defense mechanism by some organisms [B]. 
Peptides have been used within the biotechnology industry to identify antagonists blocking 
various abnormal enzymatic reactions or ligand- receptor interactions [3]. Cyclic peptides 
or constrained peptides are often preferred for this application, since such molecules lose 
less configurational entropy upon binding [S] . Cyclic peptides have backbones with a cyclic 
topology that is formed either by the condensation of two sulfhydryl (-SH) groups from 
two cysteine side chains or by the dyhydration of the head NH2 and tail COOH groups. A 
classic example of using a cyclic peptide as an antagonist is the blocking of platelet aggrega- 
tion by RGD peptides. The GPIIb/IIIa-fibronectin interaction is known to be responsible 
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for blood platelet aggregation Roughly eight cyclic peptides of the form CRGDxxxC, 
1 1 1 1 

CxxxRGDC, and CxxxKGDC that are effective platelet aggregation blockers were iden- 
tified [Zj. Several companies are now pursuing organic analogs of these RGD peptides in 
clinical trials. Although no successful drug has yet been designed by purely computational 
methods, the discovery of the RGD peptide and roughly thirty other pharmaceuticals has 
benefited in some way from computer simulation j^j. 

Simulation of complex biomolecules with standard Metropolis Monte Carlo or conven- 
tional molecular dynamics, however, often fails to sample conformations from the correct 
Boltzmann distribution. The difficulty lies in the intrinsic high energy barriers between the 
conformations adopted at room- or body-temperature, barriers that cannot be overcome 
with these methods. High temperature or potential-scaled ^U] molecular dynamics can 
cross these barriers, but these methods sample from a distribution that is not the one of 
interest. 

The configurational bias Monte Carlo method (CBMC), first developed by Frenkel, 
Smit, and de Pablo [HJII2], successfully samples complex energy landscapes by using local 
information when proposing moves. This method has been successfully applied to long 
chain molecules ^3] , phase behavior of long chain alkanes [T3J , and conformations of 
hydrocarbons within zeolite channels ^B]. A combination with a generalized concerted 
rotation scheme, inspired by the method for alkane chains |18j . has been applied to the 
simulation of linear and cyclic peptides [Hj . This approach proved to be especially efficient 
in sampling cyclic peptides with barrier-separated conformations, even when the location 
of the conformation and energy barriers were not known a priori. For cyclic peptides, this 
method changes conformations locally by perturbing backbone segments of two to three 
amino acids. Such moves have the potential to equilibrate large molecules with complex 
topologies. 

Despite the successes of the configurational bias and concerted rotation scheme, dif- 
ficulties still remain for complex cyclic peptides. Long or bulky side chains are not well 
equilibrated, for example. In some cases, the backbone of cyclic peptides is not sam- 
pled efficiently, due to the unpredictable presence of large barriers in the multidimensional 
torsional-angle free-energy landscape. We present an integrated methodology for simu- 
lation of cyclic peptides. Special concerns are given to optimizing and quantifying the 
efficiency of our method. We propose a peptide rebridging scheme, inspired by a method 
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proposed for polymers (THl EUl EEj and suitable for backbone equilibration of peptides. 
Eight torsional degrees of freedom are altered with this backbone move. Implementation 
of the move is reduced in all cases to the solution of a one-dimensional numerical problem. 
Four approaches to biasing the rebridging moves are proposed and compared. For side 
chain regrowth, we propose two new methods, 'semi-look-ahead' and 'look-ahead', inspired 
by Meirovitch's lattice scanning method [22] ■ We find that both methods equilibrate side 
chains rapidly. We compare their efficiency and discuss optimal parameter values. 

For the most complex cyclic peptides, biased Monte Carlo is still not optimally effi- 
cient. To overcome the remaining barriers to effective sampling, we add parallel tempering 
to our range of techniques. Parallel tempering is a rigorous Monte Carlo method, first 
proposed for the study of glassy systems with large free energy barriers |2Sj- This method 
has been successfully applied to spin glasses |2H EE] , self-avoiding random walks [2E] , lat- 
tice QCD [21], linear peptides [2H], and crystal structure determination [2H]- In parallel 
tempering, we consider a set of identical systems, each at a distinct temperature. Each 
system is equilibrated with both updating and swapping moves. The swapping moves cou- 
ple the systems in such a way that the lowest temperature system is able to escape from 
local energy minima without explicit knowledge of the barriers. This method achieves rig- 
orously correct canonical sampling, and it significantly reduces the equilibration time. We 
show that the combination of biased Monte Carlo and parallel tempering achieves effective 
sampling, quickly overcoming energy barriers and approaching the Boltzmann distribution. 

We define our all-atom, molecular model of peptides in Sec. 12.11 The peptide rebridging 
scheme is described in Sec. 12. 2| where technical details are provided. This section can be 
skipped on a first reading, as it is simply an extension of the method in ref. to include 
pre-screening. How biasing can be done is discussed in Sec. 12.31 In Sec. 12.41 we apply 
the concept of parallel tempering to our system. The 'semi-look-ahead' and 'look-ahead' 
methods for side chains are presented in Sec. 12.51 and Sec. 12.61 respectively. Results for 
the simulation of complex, cyclic peptides are given in Sec. El where the efficiency of our 
approach is demonstrated. We discuss the results in Sec. and make our conclusions in 
Sec. ED 

2 Simulation Methods 
2.1 Molecular Model 

We chose to use the AMBER force field J3H] with explicit atoms. Other suitable poten- 
tial models are ECEPP [31] and CHARMm [32] . Dielectric theory was used to estimate 
solvent effects [33J. Fast coordinates such as bond lengths and bond angles were fixed at 
their equilibrium value. Only the biologically-relevant, torsional degrees of freedom were 
sampled. Nonetheless, this method can be easily generalized to flexible systems. With 
this assumption, a molecule is comprised of a set of so-called 'rigid units'. Following the 
definition in ref. [TZ], a rigid unit consists of a set of atoms and bonds that form a rigid 
body. The relative distance between any pair of atoms within a rigid unit is constant. 
Adjacent rigid units are connected by a single sigma bond. 

The rigid units are labeled from the head NH2 to the tail COOH group of the peptide. 
Each rigid unit has exactly one incoming bond that starts from the previous unit and 
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ends within it. All other bonds that leave the unit are defined to be outgoing bonds. For 
example, a Cq,H unit has two outgoing bonds, the first going to the residue and the second 
going to the next backbone unit. For unit i, we define 6*j to be the angle formed by the 
incoming bond and the outgoing bond to the next backbone unit. The atom that ends the 
incoming bond is defined to be a head atom, and the atom that starts the outgoing bond 
is defined to be a tail atom. We define and r it to be the positions of the head and tail 
atoms of unit i, respectively. 

Rigid units that appear in the backbone are divided into two topological types. Type 
A includes all rigid units with identical head and tail atoms. Type B includes the CONH 
amide group, which has 9i = 0. Figure El illustrates the geometry of these two types and 
the definitions of 8^ and da, which are the angles spanned by — i"ih and the incoming 
and outgoing bonds, respectively. 

2.2 Rebridging Scheme 

i 1 

We display in Fig. [TJa typical cyclic peptide, CNWKRGDC. Although the chemical func- 
tionality of peptides lies mostly in the freely-rotating side chains, backbone equilibration is 
important since the backbone serves as a scaffold for the side chains. We, therefore, use two 
types of biased Monte Carlo moves, chosen at random: movement of a random segment of 
the backbone with rigid rotation of the associated side chains and regrowth of a randomly 
picked side chain. Here we describe the backbone move, a peptide rebridging scheme. 

The peptide rebridging scheme is inspired by the concerted rotation JH] and rebridg- 
ing 19, moves for alkane chains and the extension of concerted rotation to peptides [TT] . 
Peptide rebridging causes a local conformational change within the molecule, leaving the 
rest of the molecule fixed. Rebridging moves are not only suitable for cyclic peptides but 
also suitable for the internal parts of larger linear peptides and proteins. The main fea- 
tures that distinguish our rebridging scheme are the pre-screening process, more degrees 
of freedom per move, and more efficient biasing. We proposed five variations of rebridging 
moves, differing in the probabilities of choosing one of the many possible geometric solu- 
tions. They are Metropolis (MT), no Jacobian (NJ), with Jacobian (WJ), with Jacobian 
and old solutions (WJO), and with Jacobian and multiple rotations (WJM). Here we de- 
scribe WJ. Other variations will be described in Section |2~31 Peptide rebridging is carried 
out in several steps: 

1. Randomly select two torsional degrees of freedom that are separated by six other 
torsional degrees of freedom. We label the two torsional angles as 0o an d <fii- The 
eight rigid units, including both ends, are labeled from unit to unit 7. Backbone 
positions are denoted r ia , where i = 0, . . . , 6 and a = h (head) or t (tail). Figure El 
depicts a segment that is selected to be rebridged. 

2. The angles 4>o and 07 are rotated, causing the rigid units between and 6 to change 
while leaving the rigid units before and after 6 unchanged. The range of rotation 
is within ±A0 max . The two rotations break the connectivity of the molecule and 
provide new trial positions for r lh and r 5t . We denote the new values by (f)' and 0' 7 . 



4 



3. Find all geometrical solutions, <f)' , . . . ,<f)' 7 , that re-insert the backbone units in a valid 
way between rigid units 1 to 6. How we solve this geometrical problem will be 
described below. If no solution is found, this move is rejected. Otherwise, calculate 
the Rosenbluth factor, W^, which is defined as 

W (D) = Ejl n) exp(-/3uS n) ), (1) 

i=l 

where k^ is the number of geometrical solutions found. The Jacobian associated 
with the constraints for the ith solution is jj . 

4. Pick a solution from these few solutions with probability 



= ^, j (2) 



5. Solve the geometrical problem corresponding to O an d 07- These solutions include 
the old configuration (fi , . . . , 7 and are used to calculate the old Rosenbluth factor 

W<°> = £j! o) exp(-/M o) ), (3) 
i=i 

where is the number of solutions in the old geometry. 

6. The attempted move is accepted with the probability 

/ w( n ) \ 

acc(o^n) =mm I . (4) 

The Jacobian in eqs. and (JBJ) accounts for the fact that when we solve for the angles 
, . . . , 06, we do not produce uniform distributions. The Jacobian is defined by 

u 6 ■ e 3 




det |B| 

= [uj x (r 5t - r jh )]i, if j < 3, 

[Uj x u 6 ]j-3, if j = 4, 5. (5) 

Here Uj is the unit vector of the ith incoming bond, and e 3 is a unit vector along the labo- 
ratory z-axis. The Eulerian angle 76 is the azimuthal angle of U7 in a spherical coordinate 
system defined with U6 as the z-axis. The angle is measured with respect to the plane 
defined by U6 and 63. It is worth mentioning that in refs. ^Zj and [IB], the Jacobian lacked 
the U6 • e 3 term. The Jacobian should be invariant under orthogonal transformations, but 
the Jacobians in refs. fTTJ and [TH] are not. Despite this, proper sampling was attained, 
because the omitted terms cancel in the acceptance ratio. This cancellation does not occur 
in rebridging, since U6 is changed by the rotation of <pj. The Jacobian appears as a conse- 
quence of the end-atom constraints in the canonical partition function of a constrained or 
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cyclic molecule. The Jacobian is derived in Appendix where a careful discussion of the 
cyclic constraint is given as well. 

The geometrical problem in rebridging is solved by seeking conserved quantities. It is 
conceptually helpful to imagine a break point in the segment to be regrown. The rigid 
units before the break point are built upon the positions of the preceding units, whereas 
the rigid units after the break point are built upon the positions of the following units. 
When i-jh and r(i-i)t are expressed in local coordinates of the (i-l)th unit, the positions 
are said to be defined in 'forward notation'. When we build up these positions from the 
opposite direction, the positions are said to be defined in 'backward notation'. How we 
choose the break point depends on the identity of the rigid units to be regrown. Rigid 
units before the break point are always defined by forward notation and rigid units after 
the break point are always defined by backward notation. With forward notation we have 
r lt = r lt (>i), r 2h = r 2h (0i), r 2t = r 2t (0i,0 2 ), r 3h = r 3h (0 1 ,0 2 ), and so on. With backward 
notation, we have r 5h = r 5h (0 6 ), r 4t = r 4t (> 6 ), r 4h = r 4h (0 6 ,0 5 ), r 3t = r 3t (> 6 ,0 5 ), and so 
on. 

We use a variant of Flory's local coordinate system |34j . The system was modified for 
units with 9i = to reduce the number of variables appearing in the constraint equations. 
The general formulas for r( i+1 ) h (0i, . . . , and r it (4>i, . . . , 4>i) are 

r(i+i)h(0i, . . . , 4>i) = r it (4> h ...,4>i) + / it ,(i+i)hTj ab Ai$i 

r it (0i, . . . , fa) = r ih + Zfl^T^AihSi (6) 

where 

/ cos 6i \ 
Ai = sinfli 

V sin 9i J 

( cos 9 ih 0\ 
A ih = sin 0a 

V smd ih J 

= ( cos0i j (7) 

V sm J 

Here Uajb denotes the constant distance between r ia and r^. We call unit i the reference 
unit of unit i + We use the form of eq. © because it explicitly isolates the terms involving 
the variable 0,. 

The labels a and b can be either h or t. This notation is dropped when the unit is an A 
unit. For example, li^h says unit 1 is an A unit and defines the distance between and 
r 2 h- In this case we also drop the head or tail notation for vectors and write ri = = ri t . 

The transformation from local coordinates to the laboratory coordinates in forward 
notation is 



lab 



Tl ab (0i, 



T, fl = 




6 




(8) 



Here u«, v», and w» are the axes of the local coordinates of unit i in forward notation in 
the laboratory frame. The last matrix is modified from Flory's coordinate system. It is 
defined so that T' ab = Tj_ x when 9i = 0. This definition simplifies our algorithm. 

For the rigid units beyond the break point, we use backward notation. In backward no- 
tation, unit i + 1 is the reference unit of unit i. The general formulas for r( i _ 1 ) t (0 6 , . . . , 4>i+i) 
and r ih (0 6 , • • ■ Ai+i) are 

r(i-i) t (06, • • • , 4>i+i) = r ih (06, • • • , <f>i+i) + ^(i-i)t,ihTl ab Ai* i+ i 

r ih(06; • • • > 4>i+l) — r it + ^h,itTj a A it $j +1 

/ cos 9 it \ 
A<t = sin ft . (9) 
V sinflft/ 

The transformation from local coordinates to the laboratory coordinates in backward no- 
tation is 

T[ ab = ( x ? , % z, ) 

= T5 ab T 6 ^T 5(9 T 5 ^ • ■ ■ T( i+2 ^T( i+1 ^ . (10) 

Here 5Q, yj, and Zj are the axes of the local coordinates of unit % in backward notation in 
the laboratory frame. 

For all rebridging cases that we consider, it is possible to find three constraint equations 
with three independent torsional angles and to determine the solutions by solving a one- 
dimensional equation numerically. The constraint equations vary depending on the types 
of the units 1 to 5. Tabled lists the six distinct cases that can occur and the corresponding 
constraint equations. The dependencies of the backbone positions on the torsional angles 
are specified explicitly in the argument, and one can tell from the arguments if the positions 
are in forward or backward notation. Actually, cases 4 and 5 are mirror images of case 1 
and 2, with the rigid units labeled in the opposite direction. Strictly speaking, then, there 
are only four distinct cases: cases 1, 2, 3, and 6. 

In all cases, the first two constraint equations have at most two independent variables 
each. In one special case (case 3), an equation with only one variable is found. In the first 
three cases, the first two equations of each set are used to derive two torsional angles as 
analytic functions of <\>\. These two analytic expressions are in turn substituted into the 
third equation, which is solved numerically in the <f>\ domain. In cases 1 and 2 the other 
two independent angles are 02 and 06- Case 3 is special because 02 is a constant. The 
other torsional angle needed in the third constraint equation is 06- in cases 4 and 5, the 
approach is similar, except that the equations are solved numerically in the 06 domain. In 
case 6, the second and fourth rigid units are arbitrary and can be either A or B. This is a 
special case in which r 3 can be written as a function of a single, new torsional angle. This 
case includes, for example, ABABA, which corresponds to C Q -amide-C a -amide-C Q . It is 
obvious that the geometrical constraints keep the distances |ri — v$\ and |r3 — rs| constant 
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in all possible solutions. We also know the trial distance |ri — rs| after performing the two 
rotations. These distances are conserved in all solutions. Therefore, possible positions of 
r 3 should fall on the intersection of two spheres centered at ri and r 5 . Figure 0] shows the 
geometry of this segment and the conserved distances. If the triangle inequality 



/l,3 + h,5 > | r l — r 5 



trial 



11' 



holds, we can define a new set of local coordinates for unit 3: 



« / 



w 



(r 5 - ri)/Zi j5 

Ul X Uo/lux x u' 



™3 X U3 



'.12) 



to obtain an expression for r 3 as a function of a single, new angle 3 : 



r 3 ( 



3 ■ 



(13) 



where 



Wo 



A' 



e' 



cos 6' 3 










sin 6' 3 
sin 9' 



cos 



_i h,3 2 + h,5 2 — h,5 



2/1,3/ 



1,5 




(14) 



All of the constraint equations in tabled can be grouped by their functional forms into 
four types. The fourth column of table Q shows the type of each constraint equation. The 
general functional forms of these constraint equations are listed in table El The first type 
is a quadratic function of a single variable. The other types are functions of two torsional 
angles. They are based on either conserved distances, as in 'dist', or conserved angles, as 
in 'dot' and 'dotl'. The last column of table El lists the characteristic matrix, which is used 
in the pre-screening process and the evaluation of the third target function. 

We illustrate the peptide rebridging algorithm by taking case 6 in table Has an example. 
If eq. (ITT]) is not satisfied, the trial move is immediately rejected because of a geometrical 
failure. Otherwise, we go on. The first constraint equation allows us to express <\>\ in terms 
of 4>' 3 . To do this, we rewrite the constraint equation as 



= [r 3 ( 



r 2h ] T [r 3 ( 



r 2 h] - / 



2h,3 



and use eqs. (0) and eq. (|T3*|) to obtain 

= (Z li3 Tf 'A^s - Zi,2hTi ab A 1 * 1 ) T (Z 1 , 3 Tf 'A 3 $ 3 - h^T^A^ 



2h,3 



Zl,3 2 + h,2h 2 



I 



2h,3 



2/l,3/ 



l,2h 



$ 3 ' A 3 T T 3 ab 



Ti ab Ai$! 



(15) 



S 



We introduce the constant matrix 



C 



1 





(16) 



and multiply the first three constant terms in eq. (fTK|) by <fr 3 C$i, which is unity, to obtain 

*' 3 T M*! = , (17) 
where the constant characteristic matrix M is defined as 

M = (h/ + Zi, 2h 2 - Va 2 )C - 2/ li 3/ 1 , 2h A 3 T (T 1 3 ab, ) T T 1 1 ab A 1 . (18) 

In each case, the first two constraint equations can be cast into the form of eq. (|17|). 
The right hand column of table El lists the constraint equations and the corresponding 
characteristic matrix M for each case. Equation (fT%|) . for example, is a special case of the 
'dist' type constraint equation in tabled in which we have iy = ry = 
To solve the constraint equations, we set u>i = cos(0j/2) and use 



cos<k = (l-^ 2 )/(l + ^ 2 ) 
sin 0i = 2uJi/(l + ujf) 

to replace each cos 0, and sin (pi in eq. (fTTj) . We rewrite eq. (fT8|) as 



ft' M'Oi = 



(19) 



(20) 



where 



CO? 



The matrix M' is related to M by 
M' = 



M n + M12 + M 2 i + M 22 

2(M 31 + M 32 ) 
Mn + Mi 2 - M 2 i - M 22 



2(Mi 3 + M 23 ) 

4M 33 
2(Mi 3 - M 23 ) 



Equation (f2T)J) is quadratic in u\, and we find 

1 



(21) 



22 



Mn - M12 + M 2 i - M 

2(M 3 i - M 32 ) 
Mn - M i2 - M 2 i + M 22 



(22) 



-ci ± (c x - 4c c 2 



where 



Co 

Cl 
C-2 



2c 2 
/i±M > 



M' u + M> 3 + M 31 ^ 3 2 



(23) 



32^3 



M' 13 + M^.; + MU 



I 2 



23^3 



L 33^3 



(24) 
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Since uj 3 must produce a non-negative discriminant in eq. (|2H|) in order to produce a real- 
valued u\, pre-screening can be done by solving 

c\ - 4c c 2 = . (25) 

Equation (|23|) is a quartic polynomial equation. We use an eigenvalue method to solve this 
equation and to determine the valid domains of <ft' 3 in the first constraint equation [35]. 
Note that 0i = fi±(<f>' 3 ) has two branches. 

Following a derivation parallel to that in eqs. (fT5|) - (|2~5j) . we can write 6 = f2±(<fi 3 )- A 
similar pre-screening process is done to determine the valid domains of 4> 3 in the second 
equation. This pre-screening process reduces the CPU cost and increases the efficiency of 
the algorithm considerably. 

Evaluation of the third target function is performed over the valid domains of 0' 3 , which 
are the intersections of the valid domains found by pre-screening. The independent variable 
is chosen to be 0' 3 instead of u' 3 , since the latter may be valid on an infinite domain. To 
find the acceptable new rigid unit positions, the third target function is solved. To evaluate 
the target function, a series of calculations is repeated for each <p' 3 . First, we calculate the 
corresponding X and 6 . Second, we determine r 3 (0 3 ), r 2 h(0i), and r 4t (0 6 ). Third, we 
calculate r 2t and r 4h , which are uniquely determined by the trial r 3 (0 3 ), r 2h (0i), and r 4t (0 6 ) 
(see figure EJ). Finally, we substitute r 2t and r 4 h into the target function. We evaluate the 
target function on a grid, using a grid width of 0.003 radians. A finer grid is used when the 
function approaches zero. The function values so obtained are used to locate approximately 
the roots. Brent's method is used to refine the roots [33]. The roots for 3 are sufficient to 
determine all the backbone positions. Substituting each root into f x ± and / 2 ±, we obtain 0i 
and 06, and thus r 2 h, r 4t , and r 3 . Other backbone positions can be calculated easily. Side 
chains are rigidly rotated so as to connect to the backbone properly, and the geometrical 
problem is solved. 

For each valid 3 , there are two branches of the solution for 0i = /i±(0 3 ) and also 
two branches of the solution for 6 = / 2 ±(0 3 ). Therefore, the target function has four 
branches. Figure |5j shows a typical target function. In the case shown in figure 03 there are 
six solutions. 

In summary, the algorithm for solving the geometrical problem for case 6 works as 
follows: 

1. If the geometry does not satisfy eq. (JUJ), the move is rejected. 

2. Calculate the characteristic matrices, Mi and M 2 , of the first two constraint equations. 
Transform Mi to M' x and M 2 to M 2 , using eq. (|22jl . Find the intersection of valid 
domains using eq. (|25jl . If no common domain is found, the move is rejected. 

3. Search for roots of the third equation on the valid 0' 3 domains. Determine all the 
backbone positions associated with each solution for 3 . Determine the positions of 
all associated side chains. 

Other cases in table [T] are solved similarly, except that the independent variable is either 
0i or 6 . 
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2.3 Biasing of the Rebridging Moves 

There are several ways to bias solutions in the rebridging scheme. The first method is 
discussed in [T7] . The Rosenbluth factors are defined as 

fc(n) 

W (n) = ^exp(-/5U J (n) ) 

i=l 

W (o) = 5]exp(-/5uS o) ) (26) 
i=i 

The proposed move is accepted with the probability 

/ j( n )W (n) \ 

acc(o - n) = min (^1, j , (27) 

where J is the Jacobian. This method is called no Jacobian (NJ). 

A second method of bias, called with Jacobian (WJ), includes the bias introduced by 
the Jacobian within the Rosenbluth factors, as in eqs. (JIJ) and Q- The proposed move is 
accepted with the probability given by eq. (jlj). This approach is expected to achieve better 
sampling than NJ, since it explicitly includes the bias introduced by the Jacobian within 
the move (SEj- 

A third method of bias includes the old and new solutions within a single Rosenbluth 
factor. Solutions are picked with the probability 

v, = J<e ^ w > i = i,...,(jbW + ibW) 

W = W {o) + W {n) . (28) 

Here the Rosenbluth factors are, again, defined as in eqs. (QJ and Q. Such a move is 
always accepted, although the new state may be identical with the old state. This method 
is called with Jacobian and old solutions (WJO). 

It is possible to perform multiple rotations on O an d 07- This scheme must be based 
on WJ or NJ so as to satisfy detailed balance. We choose WJ and call this method with 
Jacobian and multiple rotations (WJM). For a rebridging move with fc max rotations, A; max — 1 
rotations around the old state must be performed to obtain a correct old Rosenbluth factor. 
A new configuration is selected from the solutions with the probability 

2^k=i vv fc 

where W^ is the Rosenbluth factor of the kth rotation, as calculated by eq. (JTJ). The 
acceptance probability is 

acc(o n) = min 1, % fc=1 , - 1 1 . (30) 
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The last method is based on Metropolis rules, in which a solution is picked at random 
without any bias, as in ref. ^B]- The picking probability and acceptance criteria are 

1 

Pl ~ W) 
acc(o — > n) = min 

The method is called Metropolis (MT). 

2.4 Parallel Tempering 

The use of biasing mitigates, but does not eliminate, the various free energy barriers in 
cyclic peptides. Even a small cyclic peptide is, in a sense, a 'glassy' system due to these 
significant and unpredictably-located free energy barriers. To deal with this issue, we use 
parallel tempering [2*3] . 

In parallel tempering we consider an extended ensemble with n systems, labeled as 
i — 1, . . . , n. Each system is a copy of the original system, except that each is equilibrated 
at a distinct temperature, T iy where % = 1, . . . ,n and T± < T 2 < . . . < T n . The canonical 
partition function of this extended canonical ensemble is given by 

n 

q=iiq<, (32) 

i=l 

where Qj is the individual canonical partition function of the ith system. Two types of 
moves are performed in the ensemble. The first is a regular Monte Carlo move within a 
randomly chosen system. The second is a swapping move. A swapping move proposes to 
exchange the configurations of the two systems i and j — % + 1, 1 < i < n. This move is 
accepted with the probability 

acc[(i, j) -> (j, i)) = min[l,exp(-/3 i U i - /3,-Uj + AU* + PjUj)] 

= min[l,exp(-A/3AU)] . (33) 

This technique forces each system to sample the Boltzmann distribution at the appropriate 
temperature. In our case, we are interested in the lowest temperature distribution only. 
The higher temperature systems are included solely to help the lowest temperature system 
to escape from local energy minima via the swapping moves. To achieve efficient sampling, 
the highest temperature should be such that no significant free energy barriers are observed. 
To ensure that the swapping moves are accepted, the energy histograms of adjacent systems 
should overlap. The sampling efficiency is modestly affected by the fraction of overlap. We 
arbitrarily chose to adjust the temperatures so that the probability of accepting a swapping 
move was roughly 0.1, and no attempt was made to optimize further these temperatures. 
We will show that the extra computational cost of simulating the higher temperature 
systems is more than compensated for by the increased sampling efficiency of the lowest 
temperature system. 



jWjfeWexp(-/3Uw; 
j(o)fc(o) e xp(-/3U(°)) 



(31) 
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2.5 Semi-Look- Ahead 



Since the conformations of the side chains determine the biological activity of peptides, 
effective sampling of side chains is important. Our method is based on the side chain 
dihedral angle moves in ref. J7j. A finite regrowth probability is assigned to each side 
chain of the molecule. A side chain move proceeds by regrowing the side chain unit by 
unit, beginning from the bond connecting the backbone to the side chain. At each step, 
Hi twigs are generated and used to calculate the new partial Rosenbluth factor. One of 
the twigs is selected with a probability proportional to the Boltzmann factor associated 
with that twig. The old configuration and n a — 1 random twigs are generated and used 
to calculate the old partial Rosenbluth factor. This procedure is repeated until the end of 
chain is reached. The new chain is accepted with the probability 



where is the product of the new partial Rosenbluth factors, and is the product 
of the old partial Rosenbluth factors. 

We propose a new method called semi-look-ahead (SLA) for side chain regrowth. For 
each torsionally-flexible bond, we define the group of atoms included in the partial Rosen- 
bluth factor to be the maximum set of atoms whose positions are uniquely determined 
by choosing the trial rotation of this bond. Figure |B1 sketches this new definition of atom 
groups and contrasts it with the one in ref. ^7] . Our definition includes atoms beyond the 
boundary of rigid units, including the head atoms of rigid units adjacent to the current 
one. We expect SLA to achieve better sampling efficiency and faster equilibration than the 
method without look-ahead |17j . due to the improved energy estimate for the biasing. 

The incremental energy at each step can be split into the internal and external com- 
ponents jHZj. With this decomposition, torsional angles are generated with a probability 
derived from the internal energy, and the partial Rosenbluth factors include the external 
energy only. In our system, only torsional energies can be put into internal energy, and 
these energies account for only a small fraction of the total energy. We find it most effi- 
cient to set the internal energy to zero and to include all of the energy within the external 
component. 

2.6 Look Ahead 

For long chains or chains with bulky units, a more extensive form of look-ahead may help 
to avoid proposing high energy configurations [221- The idea is illustrated in figure If we 
regrow the molecule by exploring the energy landscape only one rigid unit ahead, we will 
choose one configuration, as in figure [7^,. If we can look ahead two rigid units at one time, 
we may find the high energy region associated with that configuration and choose a more 
likely one instead, as in figure 03- 

We proposed two methods for look-ahead. The idea is to include a contribution from 
the energetic surroundings of the succeeding unit within the Rosenbluth factor. The first 
method, look-ahead (LA), generates rii trial rotations of the unit to be regrown and n-i trial 
rotations of the succeeding unit for each of the trial rotations of the first unit. In the second 



acc(o — > n) = min 1 




(34) 
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method, we set n\ = n 2 = n. We generate n configurations of the first rigid unit, with n 
configurations of the second unit associated to each. When regrowing the second unit, we 
use the n configurations already proposed during the regrowth of the first configuration. 
We, therefore, generate only the configurations for the third unit when regrowing the second 
unit. This method of look-ahead with recycled configurations is abbreviated as LARC. 

We now describe the procedure for carrying out these methods. Suppose we want to 
cut and regrow rigid units i = 1, . . . , N. The following procedure describes how to generate 
and accept these units: 

1. Generate a set of n\ trial torsional angles {0i(a)}, a = l,...,ni. Each angle is 
generated according to the internal potential of unit 1 

pf\a) = dexpi-pVfiMa)}} ■ (35) 
Denote the external energy of unit 1 at 0i(a) by Uf^a). 

2. For each trial 0i(a), generate a set of n 2 torsional angles {02 (a, 7)}. Each angle is 
generated according to the internal potential 

V [ n \a, 7) = C 2 exp{-/3U 2 nt [0 2 (a, 7)]} • (36) 

Denote the external energy of unit 2 at 0i(a) and 2 (a, 7) by U| xt (a, 7). For LA, 
the number n\ can be different from n 2 . For LARC n\ = n 2 . 



3. Define 



E;i!exp[-/5U e 2 xt («, 7)] 



and 



w 



(n) 
1 



m 

4. Pick a 0i (a) with the probability 

» 



wo '(a) = -7=1 ^ l " 2 (37) 

n 2 



EaLi gg* expJ-^Uf^)] ex V {-f3XJ?\a, 7)] 
n x n 2 

ESLi exp[-/3Uf *(aO]w^(a) _ 



^ ^[-^)]^(°) . (39) 



w 



To simplify the notation, we switch the labels of the chosen ath angle with the first 
torsional angle so that the chosen angle is first. 

5. Repeat steps CE}-® for rigid unit 2 to unit N-l, except that for LARC, the n 2 twigs 
of unit 2 corresponding to the chosen unit 1 are recycled to be the n\ (jii = n 2 ) trial 
configurations of unit 2, and so on. 
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6. For the Nth unit, which is the last unit, there is no need to look ahead, so we repeat 
step (1) and generate n\ torsional angles {0at(q;)}. Calculate 



w 



(n) 



ni 



Eexp[-/?U e /(«)] , (40) 



a=l 



and pick a <Pn{ol) with the probability 

«#>(«) = exp[ "^f' (a)1 • («) 

w N 

We also need to generate and calculate the old Rosenbluth weights: 

1. Generate n\ — 1 trial torsional angles with the probability given by eq. (jHSJ)- These 
angles and the original angle comprise a set of torsional angles {(pi(a)}. Let the 
original angle be labeled as 0i(l). 

2. For each <pi(a) other than 0i(l), generate n 2 torsional angles {<f) 2 (a, 7)}- For the 
original angle 0i(l), generate n 2 angles if the method is LA and n 2 — 1 angles if the 
method is LARC. For LARC add the original 4> 2 to the set of angles generated and 
label the original angle as 02(1, 1). All configurations other than the original one are 
generated according to the probability 

p 2 °\a, 7 ) = C 2 exp{-/3Uf [fc(a, 7 )]} • (42) 

Define w^a) and w^ in an analogous way as wj^a) and w^. 

3. Repeat the preceding two steps for unit 2 to N-l. 

4. For the Nth unit, which is the last unit, generate a set of n\ — 1 angles {^at(«)} with 
the probability given by eq. Add the original angle. Calculate . 

The proposed move is accepted with the probability 



acc(o -> n) = mm 1, — - , (43) 



w(°) 

where the Rosenbluth factors are defined as 



W (n) 



nf =2 wj n) (i) 



Y\ N w (o) 

W ( ) = iii=i Wi — _ , u . 

nf =2 wf(i) 

Note that the denominators of eq. ()44j) come from the bias introduced by eq. (|39|). In 
Appendix El we prove that the LA method satisfies detailed balance. 
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3 Results 



3.1 Backbone 



i 1 

We first apply the rebridging scheme to the cyclic peptide CG 6 C. Simulation results for 
the five different variations of the rebridging scheme were generated. All simulations were 
performed on a Silicon Graphic Indigo 2 195 MHz R10000 workstation. The system was 
equilibrated at 298 K. We used an optimized value of A0 max = 10° in all simulations except 
for WJM, in which the optimal value was A0 max = 30°. A probability of 0.05 was assigned 
for equilibration of either of the two side chains, NH 2 and COOH. These two side short 
side chains are well equilibrated by the method without look-ahead, which is used in our 
simulations. We define the acceptance probability P acc to be the ratio of accepted backbone 
moves to trial backbone moves. The efficiency of the Monte Carlo scheme is measured by 
the average displacement of the molecule per CPU time. We define A0 avg as the average 
of the absolute change of torsional angles per trial backbone move: 

E^rE? =0 |A<M*)| 
A 0avg = 77 • (45) 

Atrial 

This value is a measure of the size of successful moves and the efficiency of the re- 
bridging scheme. There is an intrinsic energy barrier for the C/3SSC3 dihedral angle at 
^c^ssc^ — 180°. The magnitude of this barrier is estimated to be 5.5-6.5 Kcal-mol -1 [TT] . 
A barrier-crossing event happens whenever this angle crosses ^c^ssc^ — 180°. We define 
the barrier-crossing frequency as the total number of barrier-crossing events divided by the 
total number of backbone moves. Table El lists simulation results obtained with the five 
different rebridging methods. 

Figure |H1 shows histograms of the angles ^c^ssc^ observed in these simulations. The 
NJ method yields a left peak that is slightly higher than those from other methods. The 
MT method yields the lowest left peak. Although the histograms are similar, they did 
not converge to a unique distribution within our chosen simulation time. This is because 
barrier crossing was not frequent enough to produce accurate statistics. 

To increase the sampling efficiency, we performed a parallel tempering simulation with 4 
systems. The system temperatures were 298 K, 500 K, 1000 K, and 3000 K. The rebridging 
moves were performed using the WJ biasing method. The probabilities for proposing swap- 
ping moves, backbone moves, and side chain moves were 0.1, 0.45, and 0.45, respectively. 
When a swapping move was chosen, two randomly chosen adjacent systems were proposed 
to swap configurations. The probabilities for swapping the two pairs with lower tempera- 
tures were doubled to accelerate de-correlations. When a backbone move or a side chain 
move was proposed, the system was picked with a probability that updates the two lowest 
temperature systems twice as frequently. We do this because of the longer correlation times 
at lower temperatures. The simulation consisted of 160000 Monte Carlo cycles. Each cycle 
proposed four swapping or updating moves, chosen at random. The whole CPU time taken 
in this run was 48 hours. The initial 20000 cycles were discarded to avoid equilibration 
effects. 

The swapping moves can occur with sufficient probability only if the energy histograms 
of adjacent systems overlap. Figure El shows that this condition is satisfied for our choice 
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of temperatures. Table |U lists the acceptance probabilities of swapping moves in this 
simulation. 

Figure Uni shows the distribution of the C^SSC/3 angle observed in the simulation. The 
histogram converged to a unique distribution with very little simulation data. After 80000 
cycles, the observed distribution was almost indistinguishable from the one observed at 
160000 cycles. With parallel tempering, we obtain substantially better statistics in less 
computation time. If fact, the computation time was two-thirds of that used in the single 
temperature simulations in figure |H1 Note that the histogram at 3000 K is essentially flat, 
and so at this temperature the molecule is free to cross the barrier at ^c^ssc^ — 180°. 
Strong steric repulsion between hydrogen atoms connected to the adjacent Cg atoms still 
prevents the molecule from adopting a conformation with </>c /3 ssc (3 — 0°, but this does not 
hinder equilibration. 

3.2 Side Chains 

i 1 

We performed simulations on the cyclic CNWKRGDC molecule to test various side chain 
regrowth methods. This medically-relevant molecule has long and bulky side chains. Sim- 
ulations were done both on a fixed backbone scaffold and on a backbone equilibrated with 
rebridging and parallel tempering. First, we fixed the backbone and chose side chains at 
random to regrow, using the method without look-ahead and the SLA method. We tested 
the dependence of the equilibration on the number of trial rotations n\. The backbone 
was fixed throughout this simulation. Figure ^2 shows the energy as a function of CPU 
time during the equilibration period. Starting from a high energy configuration, the SLA 
method with n\ = 100 or n\ = 10 reaches equilibrium rapidly. The non-look-ahead method, 
however, had difficulty in finding low energy regions. It took the system with n\ = 10 more 
than 50 minutes to reach low energy configurations. The system with n\ = 100, however, 
never reached equilibrium during the simulation. Although the associated acceptance prob- 
abilities are not small, the use of n — 100 results in essentially non-ergodic sampling. We 
point out that the non-look-ahead method equilibrates the system faster with n\ — 1 than 
with ni = 10 or n\ = 100, although non-look-ahead is always slower than SLA. 

Figure E] may prompt the following question: How do we determine the optimal value 
for ni? For short side chains, we expect that a small ni will work well. For longer side 
chains, we expect that a larger value of n\ will help to explore the torsional space. The 
optimal value, therefore, will differ for each side chain. 

We next performed parallel tempering simulations with five systems, using the SLA, LA, 
and LARC methods for side chain regrowth. The backbone moves were performed by the 
WJ biasing method. The system temperatures were 298 K, 450 K, 780 K, 1700 K, and 5000 
K. The simulation consisted of 100000 Monte Carlo cycles, except in the cases of n\ — 1 
and ni = 30 for SLA and ri\ x n<i = 20 x 10 for LA, for which the number of cycles were 
200000, 200000, and 60000, respectively. Each cycle proposed five swapping or updating 
moves, chosen at random. The probabilities for proposing swapping moves, backbone 
moves, and side chain moves were 0.1, 0.45, and 0.45, respectively. When a swapping move 
was proposed, two adjacent systems were chosen randomly, with the probability of picking 
system 1, 2, 3, and 4 equal to |, 1, |, and =, respectively. When an updating move, either 
for backbones or for side chains, was proposed, we chose system 1, 2, 3, 4, and 5 with the 
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probabilities |, |, |, ~, and |, respectively. We focused on the sampling efficiencies for the 
tryptophan, lysine, and arginine residues. The lysine residue has a large number, 5, of rigid 
units. The tryptophan residue has an indole group. The arginine residue has a guanidine 
group. Both groups are bulky and tend to have low acceptance probabilities. For each 
side chain, we used the total torsional displacement per computation time, A0/CPU, as 
an index to the efficiency Both side chain moves and swapping moves contributed to A(f>. 
We define the acceptance probability P acc to be the ratio of successful moves to trial moves 
in a side chain. The results are summarized in table 

Among the four simulations with SLA, the choice n\ = 10 yields the best efficiency for 
lysine, and n\ = 30 yields the best efficiency for tryptophan and arginine. Among the four 
simulations with LA, the best efficiency for tryptophan is produced when n\ x ri2 = 10 x 5. 
Lysine and arginine are equilibrated most efficiently with n\ x 712 = 10 x 10. With LARC 
the efficiency for tryptophan is the best when rii — 5. The efficiency for lysine is the 
best when n\ = 10. Interestingly, arginine is so difficult to equilibrate, typically having 
such a low acceptance probability, that the efficiency was best with n\ = 15. In general, 
LARC is more efficient than LA. Comparing the results from various methods, we find that 
tryptophan is equilibrated most efficiently by SLA, and lysine and arginine are equilibrated 
most efficiently by LARC. 

4 Discussion 

Among the five rebridging methods listed in table El WJO gives the highest acceptance 
probability, where P acc in WJO is defined to be the probability of accepting a solution other 
than the old one. WJO also produces the highest A0 avg . The distribution generated by 
WJO is the most smooth among the curves, which shows that it is efficient in sampling local 
conformations. However, the CPU time per move for WJO is slightly higher than that for 
WJ or NJ, since there is no early rejection in WJO. We performed simulations with WJM 
using different A0 max and found the optimal value to be A0 max = 30°. Simulations with 
A0 max < 30° were dominated by smaller moves that lead to infrequent barrier-crossing. The 
computation cost per WJM move is roughly proportional to the number of trial rotations. 
It is seen in table 01 that each WJM move takes more than twice the time of a WJ move. 
Therefore, WJM is less efficient than the first three schemes in table El As expected, MT 
yields a fairly low acceptance probability. Taking the CPU cost into consideration, the 
efficiency of WJ is close to that of WJO. The efficiency of NJ is less than WJ and WJO. 
The WJM method is less efficient than the previous three schemes. The MT method is the 
least efficient. 

Our rebridging scheme is capable of overcoming energy barriers and promoting the fre- 
quency of barrier crossing. The fourth column in table El lists the barrier-crossing frequency. 
The WJ method yields the highest barrier-crossing frequency, and WJO yields the lowest. 
This is due to the predominance of local moves in WJO. An accepted move in WJO can 
be a move that reconfigures six degrees of freedom only, which is less likely to lead to a 
barrier-crossing event. 

1 1 

Barrier-crossing is a rare event in a simulation of the CGeC peptide. According to the 
potential of mean force determined by umbrella sampling, the potential at ^c^ssc* = 90° is 
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less than that at ^c^ssc^ = 270° by roughly 1 Kcal-mol -1 [Ej. We, therefore, expect the left 
peak to be substantially higher than the right one. Our results with biased rebridging moves 
are consistent with the potential of mean force, but the statistics are not good enough. 
Because steric repulsions are severe in our system, the correlation time for other degrees of 
freedom is also long, and these degrees of freedom also slow down the barrier-crossing. We 
suspect that there is a set of low energy conformations, separated by low-energy barriers, 
near <f>c p saCft = 270°. 

We have found that parallel tempering is an efficient and automatic means to over- 
come these barriers. The overlap of energy histograms guaranteed reasonable acceptance 
probabilities of the swapping moves. These swapping moves transfer configurations en- 
countered at high temperatures to systems with low temperatures, thereby helping the 
low-temperature systems to escape from local energy minima. Such escape from local 
minima is important for efficient sampling, especially in glassy systems with high energy 
barriers. Cyclic peptides fall in this category, because of the torsional barriers and steric 
repulsions associated with the cyclic constraint. Our results provide additional evidence 
that parallel tempering is a powerful tool for studying glassy systems. Linear peptides, 
on the other hand, have a fairly simple free energy landscape, and so they do not benefit 
substantially from the parallel tempering approach [2%] . 

For equilibration of side chains, we tested whether the inclusion of torsional interaction 
energy in the internal potential is effective. We find that the acceptance probability is lower 
and the simulation time is increased through the use of internal biasing. Presumably this 
is because U mt is only a small fraction of the total interaction energy, and so biasing the 
torsional angles according to this term does not lead to better sampling. 

CBMC without any look-ahead does not equilibrate long or bulky side chains as well 
as does CBMC with look-ahead. The key difference is that without look-ahead, the head 
atoms of succeeding units are not included. Without look-ahead, a chosen rotation, though 
probably a low energy configuration for the local atoms, may implicitly put adjacent head 
atoms in high energy positions and thereby fail to find the lowest energy region. Using 
fewer twigs in the non-look-ahead method resulted in better equilibration, as shown by the 
results for n\ — 1 in figure This occurs because with n\ = 1 the regrowing units have 
a better chance to miss the incorrectly identified low energy regions. 

Increasing the number of twigs raises the acceptance probability in SLA, LA, and LARC, 
but at an increased CPU cost. The optimal n\ is attained when these competing effects 
are balanced. We know that for rougher energy landscapes more trial rotations need to be 
generated. From the first row of table all three residues were poorly equilibrated by SLA 
with ri\ = 1. The torsional displacement A(f> in this case comes mainly from the swapping 
moves. Equilibration is improved by using a greater m, which increases the acceptance 
probabilities significantly. However, the acceptance probability for arginine with m = 100 
is lower than that with n\ = 30. This means that improving the local sampling does 
not always lead to better global sampling, and this in turns implies the necessity of more 
significant look-ahead sampling. The arginine residue is both long, with four rigid units, 
and big, with a guanidine group at the end. Therefore, look-ahead is crucial to bypass high 
energy regions. Comparing the results for SLA with n\ = 10, LA with m x ri2 = 10 x 10, 
and LARC with n\ = 10, we see both LA and LARC enhance the acceptance probabilities. 
The only exception is the shortest residue, tryptophan, for which LA yields a acceptance 
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probability slightly lower than that from SLA. Clearly, LARC is superior to LA, because 
LARC costs less computation time while yielding higher acceptance probabilities. For the 
long residues, lysine and arginine, LARC yields the highest efficiencies among these three 
methods. The results suggest that, for long and bulky side chains, significant look-ahead is 
necessary. It is not necessary to use the same regrowth method for all side chains. Indeed, 
the optimal approach is to use a different regrowth method for side chains of different 
identity. For short side chains, SLA appears to be optimal. For longer side chains, LARC 
is the best method to use. We believe there may be some cases in which look-ahead is the 
only efficient approach for equilibration. Likely cases are those where there is substantial 
crowding and steric overlap, such as docking of a drug or signaling molecule to a protein 
receptor site or binding of antigen by the hypervariable region of antibodies. 

5 Conclusion 

Peptide function comes primarily from the chemical functionality of the side chains atoms, 
although the side chains themselves are positioned by the backbone atoms. For cyclic pep- 
tides, both backbone and side chain atoms are difficult to equilibrate with standard simula- 
tion techniques. We have described a new and efficient Monte Carlo simulation method for 
complex cyclic peptides. The combination of biased, look-ahead Monte Carlo and parallel 
tempering leads to rapid and accurate sampling of the relevant room- or body-temperature 
conformations. Specifically, the look-ahead biasing is helpful for equilibrating long or bulky 
side chains, and the parallel tempering is essential for crossing torsional-angle free-energy 
barriers at a rapid rate. A variety of details, such as prescreening, improved Jacobian 
biasing, semi-look-ahead, and look-ahead, are important components of the method. 

We believe that parallel tempering will prove to be a generally useful method for simu- 
lation of 'glassy' atomic systems with multiple, important conformations separated by large 
and unpredictable free energy barriers. Explicit atom models, which are more accurate but 
which also increase the ruggedness of the potential energy landscape, are naturally treated 
within this approach. We expect that application of our peptide simulation method to 
high-density or crowded situations, such as peptide-receptor or antibody-antigen binding 
events, will further demonstrate the efficiency and power of our approach. 
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A The Jacobian in the Rebridging Scheme 

In the rebridging scheme, each solution should be weighted by a Jacobian to correct for the 
non-uniform distribution of the angles 0i, . . . ,06 generated by the non-linear solution of 
the geometrical problem. We derive the Jacobian here from the classical partition function. 
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We initially consider a simple cyclic molecule with only N backbone atoms and N back- 
bone torsional degrees of freedom. This assumption is relaxed at the end to accommodate 
the complicated backbone and side chain geometry of a real peptide. The momentum part 
of the partition function can be integrated out if we assume that the bond length and angle 
constraints are enforced by springs with infinite force constants. We, thus, focus on the 
configurational part. The configurational partition function is 

Z = J dr N exp(-/3U) 

= J dr N dr N+1 dr N+2 dr N+3 5 3 (r N+1 - ri)5 3 (rjv +2 - r 2 )5 3 (r N+3 - r 3 ) exp(-/3U) , 

(46) 

where we have introduced three vector delta functions to account for the cyclic constraint. 
The choice of fixed-end constraints is not unique. We will discuss an alternative form later 
in this section. 

We start the derivation by performing a transformation from r N to y N 

yi = ri 

y, = % = 2, . . . , N + 3 . (47) 

The Jacobian of this transformation is unity. We transform again from y N to local coor- 
dinates. We define U = |yj| and 6>j to be the angle formed by and y^+i. We transform 
from y 2 to l 2 and u 2 , where 



y2 

U 2 = 1 1 ■ 

ml 

(48) 

Then we transform from y 3 to Z 3 , 9 2 , and 72, where 72 is the azimuthal angle of u 3 in 
a spherical coordinate system defined with u 2 as the z-axis. The angle is measured with 
respect to the plane defined by u 2 and e 3 , the fixed laboratory z-axis. We further trans- 
form yj to a spherical coordinate system Zj, and fa-i, i = 4, . . . , N + 3. With this 
transformation, we obtain 

f N+2 

dyil 2 2 dl 2 du 2 l 3 2 dl 3 d9 2 sm6 2 d / y 2 / ]~[ dl i+ id9id(f>i 

J i=3 



x 



(N+l \ 

63 ( E yj I ^ 3 (yJv+2 - J2) 5 3 (y N+3 - y 3 ) 



x J (- n 5— «P(-/?U) • ( 4 9) 

\<4, . . . , Ln+3, V3i ■ ■ ■ , VN+2, (p3,..., <pN+2 J 

The Jacobian is simply J = Ilil 3 2 k+i 2 sin 6i. The fast coordinates k and 0, are fixed 
due to the strong harmonic potentials. We denote the equilibrium values of k and d, L by 
l\ and 0°, respectively. With very large spring constants, the dependence of the integrand 
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along these coordinates can effectively be replaced with delta functions. Therefore 
Z = C" [ d yi l 2 2 dl 2 du 2 l 3 2 dl 3 d0 2 sm6 2 d l2 5{l 2 - l° 2 )5(l 3 - l° 3 )5(0 2 - 9° 2 ) 



x Yl dl i+ id6id(f)i J(/ 4 , • • • , /at+3, #3, • • • , On +2) exp(-/3U ) 

J i=3 

/JV+1 \ 

x 63 I E yjj 63 (yn+2 - y 2 )5 3 (yAr +3 - y 3 ) 



A? 



N-l 



N 



x n - '*) II - E y*"l - C 

fc=4 fc'=3 fc"=2 



X 5 COS 



-1 Y2 ■ (- Ee=2 y*') x yiV ' (~ ^?=2 yj') 



|y2|| Ei» =2 yi" 



&l \S[ cos" 



|yjv|| i^j"=2yj" 



0% > 



(50) 



where Uo is the potential energy measured at the ground configuration of these hard coor- 
dinates. We now integrate over the hard coordinates U and 0j. The Jacobian is simply a 
constant and can be taken out of the integral. Since the constraint 5 3 (Z)S 1 Yi) holds, we 
can replace every — X)£L 2 y» with yjy+i- Similarly, we can replace y 2 with Yn+2- Replacing 
the arguments in the last two delta functions with N+ i and N , respectively, we obtain 



Z — C" dyid\i 2 d^ 2 j JJ dl i+ id9id(pi exp(-/5U ) 

J J i=N 

/jv+i \ 

x 53 1 E y^ j £ 3 (yjv+2 - y 2 )5 3 (yiv + 3 - y 3 )£(|yjv+i| - 1°) 
x 5 (e N+1 - 0°) 5 (o N - 0%) . 

We use the equalities 

5 3 (yiv+2-y2) = 5(l N+2 - l 2 )5 2 (u N+2 - u 2 )/l N+2 2 
£ 3 (yiv+3-y3) = S(l N+3 - l 3 )6 2 (u N+3 - u 3 )/l N+3 2 

to integrate over Zjv+i? In+2, In+3, On, and N+1 to obtain 

„ , N+2 . 

Z = C J rfyirfu 2 (i72 y n d(f)i J d6 N+2 exp(-/3U ) 



(51) 



i=3 



'iV+1 



x 58 ( E Vj ] 52 (u N +2 - u 2 )S 2 (u N+3 - U 3 ) 



(52) 



Note that 



y d^ +2 <J 2 (ujv+3 - us) 



y d^+2<5(7Ar+2 - 72)^(^+2 - 2 )j sin ^2 
5(7^+2 - 72)/ sin 6*2 . 



where 



cos 



-1 y3 • yjv+2 

|y3l|yAr+2| 



(53) 



(54) 
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and 7a?+2 and 72 are the azimuthal angles of utv+3 and u 3 in a spherical coordinate system 
defined with uat+2 = u 2 as the z-axis. The angles are measured with respect to the plane 
defined by u 2 and e 3 . Integrating over 9^+2, we obtain 



N+2 



Z = C J dyxd^d^ J IJ 

i=3 
/N+l 

exp(-/3U )5 3 [J2 yj ] s2 (un+2 - U2)S(lN+2 - 72) • (55) 



X 



\i=2 



This is the partition function of a classical, cyclic molecule. We see that it is an integral 
over torsional space with delta function constraints. These constraints cause an intrinsically 
non-uniform distribution of every torsional angle, even in the absence of any energy of 
interaction. It is convenient to transform the last six torsional coordinates to the variables 
rjv+i, Ujv +2 , and 4>n+2 and to integrate over these six coordinates. Then 



N— 4 

Z = C J dyidvL 2 d^ 2 J J[ dfa f dr N+1 du N+2 d^f N+ 2 



i=3 

x £ \h f 0jV - 3, :" ,0jV+2 ) exp[-/3Uo(A0]) 5 3 (5>i I ^+2 " u 2 )5(7v +2 - 72) 

k=l { \ T N+1, U- N +2,lN+2) J \ j=2 

, N—A k B 

exp[-p\J {k)] } . (56) 



f -iV-4 k s 

C / dy x dvi2dl2 I \{ d<pi £ < J fe 

J J i=3 k=l I 



>N-3 ■ ■ ■ <PN+2 



k=l I \ r JV+l) u V+2, 7at+ 2/ 



ri,u 2 ,72 



The index k labels the solutions {0tv_ 3 , . . . ,0tv +2 } that satisfy the fixed-end constraints. 
The summation accounts for the fact that multiple solutions are possible. In the rebridging 
scheme, we always relabel 0jv-3> • • • > 4>n+2 as (pi, ... , </> 6 and r 1; u 2 , and 72 as r 5 , u 6 , and 7 6 . 
From eq. it is clear that each solution must be given a weight, which is the Jacobian. 

The 6x6 Jacobian is actually the determinant of a 5 x 5 matrix, since the last torsional 
angle does not affect r 5 or u 6 . Therefore, 

dr 5 = 8uq = 



We also know that 



So we obtain 



^76 

d(j) 6 




"6 • e 3 

det|5| 

= [Uj x (r 5 - Tj)]i, if j < 3 
= [iij x u 6 ] j _ 3 , if j = 4 or 5. (57) 

Since the Jacobian is independent of 6 , we might conjecture that it is also independent 
of 0i. The reason is that the Jacobian should not depend on the direction that we choose 
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for the labeling of the rigid units. Hoffmann and Knapp derived a 4 x 4 Jacobian depending 
only on 02, 03, 04, and 05 for case 6 of tabled t 38j. We will show that, with suitable choice 
of end-constraint variables, a 4 x 4 matrix can be derived in all cases. The idea is to choose 
a set of end coordinates that are almost independent of 0i . 
Integrating eq. over 4>n+2, we obtain 



f r N+l /N+1 \ 

Z = C J d yi du 2 d l2 J J] dfa exp(-/5U )5 3 E ^ S 2 (u N+2 - u 2 ) • 

i=3 \ i— 2 J 

Let Ar = r N ^ — r N „% and introduce the following end coordinates 



(58) 



R 



lArl 



cos 



the torsional angle of Ar in local coordinates of unit N — 3 
i / Ar ■ uat+2 \ 



cos 



e = the torsional angle defined by uns, Ar, and uat+2 • (59) 

Note that R, 9 b , 9 e , and e are independent of 0at_3 and that 0& is linear in 0jv_3. Substi- 
tuting these coordinates into eq. we obtain 

Z =C / c/yi<iu2(i72 / J| c?0i exp(-/3U ) 



R? sin sin 9 e 

x - |r a - r N ^\)8(4> b - <f>° b )8{9 b - 9° b )5(9 c - 9° c )5(<Pc - €) ■ 



(60) 



Here 



el 



ft 

92 



cos 



-1 / ( r l — r Ns) " &N-3 



l r i _ rjv-3| 

the torsional angle of n — r^_ 3 in local coordinates of unit N — 3 

! /(n - rjv-3) • u 2 ~ 



cos 



I r l — r N-3\ 

0g = the torsional angle defined by uat-3, Ar, and u 2 . 
Transforming coordinates from 4>n-3, ■ ■ ■ , (pN+i to R, 9 b , 06, 9 e , and e , 

AT-4 „ y 



(61) 



we obtain 



Z = C J dyidvL 2 d^2 J \\ dfa J dRd9 b dcf) b d9 c 

i=3 



R 2 sin 9h sin 9 P 



x - |n - r N -3\)5(9 b - 9° b )5((j) b - 4)5(9 C - # c >(0 e - 0° 



X 



E J* 



fc=i 



-R, b , b , e , 0„ 



AT-4 



1^-^-31,^,^,^,08 

1 



exp[-/3U (fc)] 



x 



E J* 



i=3 

AT-3, • • • , 07V+2 



.R 2 sin sin 6L 



k=l 



R, 9 



, Vb, <Pb, <7e> S^e, 



exp[-/?U (fc)] 



(62) 



|n-rw-3l,«g.^.flg,08 
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The Jacobian can be rewritten as 



R 2 sinfl fe sinfl e |det(B") 



(63) 



where 



OR „ d9 b dcj) b 

y ~~ T>1 ' 2 o ~ ~al ' a 3 i — ~al ' 

CKpN-A+j 0<pN-A+j C)<pN~A+j 

B 4 J = «7 ' B Bi = 7^7 , j = 1, . . . , 5 . (64) 

OCpN-i+j 0(pN-4+j 

The first column of B" has only one non-zero element, which is B"3i = 1. Taking the 
cofactor of B'21, the determinant can be replaced with that of a 4 x 4 matrix 

_ OR _ d9 b _ d9 e 

<90jV_3+j 0(j) N -3 +j 0(j) N - 3+j 

B% = , J = l,.-.,4. (65) 

It is easy to extend our approach to include side chains and constrained torsional angles. 
Following an approach parallel to eqs. (|46 )) -([49 )) . we obtain an integral with additional 
degrees of freedom contributed by side chains. These degrees of freedom are not constrained, 
and they can be integrated out first. Therefore, we can simply replace U with an influence 
functional. The final form of the Jacobian is unaffected. For peptides, rotation about the 
C-N bond in the amide group is governed by a large force constant. In our simulation, we 
constrain these torsional degrees of freedom as well. Each constrained bond adds a delta 
function to eq. (j55j) . Let A be the set of fa that are constrained. Then 

. „N+2 /JV+1 \ 

Z = C J dyidu 2 d7 2 / J| dfa exp(-/?U )£ 3 I £ y, S 2 (u N+2 - u 2 )5("f N+2 - 72) 

i=3 \j=2 / 

x II K<Pk - 4>° k ) . (66) 

Let G(l, N + 2) denote the last I flexible torsional angles from 3 to 4>n+2- Integrating 
out the other degrees of freedom, we obtain 

Z = C dyidu 2 d7 2 / J[ 

\i<£[AuG(6,N+2)] 

G(6,iV + 2) 




r Ar+l, Uat + 2, 7iV+2 / 



exp[-pU (k)] } . (67) 

ri,U2,72 



If 4>n+2 is constrained, tn, i"iv+i, rjy+2, and rjv + 3 define a rigid unit. The corresponding 
fixed-end coordinates in our algorithm are chosen to be r^v, ujv+i, and Jn+i, instead of 
rjv+i, U7V+2, and 7^+2- This apparent difference causes no ambiguity, since both sets define 
the same rigid unit. The Jacobian between these two sets is unity. 
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Relabeling the torsional angles in G(6, N + 2) by <pi, . . . , we recover the Jacobian 
in eq. (jSJ. The 4x4 Jacobian in this case can be derived analogously. The final result, 
which is numerically equal to eq. (jSJ), is 

1 



i2 2 sin0 & sin0 e |det(B') 
The components of B' are given below: 

• Ar 



where 



B'lj = 


BR 


1 <9Ar 


d(j)j 


R d<j>j 


B' 2 j = 


de b 


-1 


d(f)j 


R sin 9 b 


B' 3 j = 


oe e 


-1 




R sin e 


B' 4j = 


d(f) b 

d(j)j 





-1 



1 , . A <9Ar / 
-— B yAr • ui + — — • ui 

R o<pj 

— 1 , 9Ar 

— B yAr • u 6 + — - ■ u 6 + Ar • (u 3 - x u 6 ) 



[Hi x Ar) ■ (Ar x u 6 ) 



R? sin e sin 6 b sin e [ R? sin e sin 0^ sin 6 e 
( dAr 

x 2—— • Ar sin b sin 9 e + R 2 cos 9 b sin e B' 2 j + -R 2 sin 9 h cos e B' 3 j 



9Ar /<9Ar 
(ui x -^-) ■ (Ar x u 6 ) + (ui x Ar) ■ \^—- x u 6 + Ar x (u,- x u 6 ) 



(6J 



9Ar 



u, x (r 5t - r,- h J 



The quantities needed to calculate B' are iii, Ar, 9 b , 9 e , and sin0 e - 

B Detailed Balance for LA 

In this appendix we prove that the LA method satisfies detailed balance. The proof for 
LARC can be done analogously and is not presented here. In our algorithm, the old 
Rosenbluth factor is not evaluated until all units have been given new positions. 

In fact, W^-* can be calculated at any time. In the proof, we calculate the partial old 
Rosenbluth factor wj '* of unit i once a new proposed move for unit i is made. We first 
derive the probability for proposing a forward move of the first unit. By analogy, we derive 
the probability for proposing a reverse move of the first unit. Since we generate both the 
old Rosenbluth factor and the new Rosenbluth factor in a random way, their probabilities 
should be included. This is the so-called super detailed balance condition 11J. We will 
show that LA satisfies super detailed balance. 
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Let at (o-rn; {^ n) (a)}, {#(a, 7)}, {^(a 7 )}, {4° ) (a , > 7')}) be the probability of 

proposing a move from </>i° ; (l) to 0^ (1), given {0^ (a)}, {02 (ck, 7)}' {^i^t"')}' an d 
{02°^ (a', 7')}- Consider the following three events: 

1. Generating ra 1 n 2 new twigs, which has the probability 



m «2 

nM n) («)n^ n) («, 7). 

Q = l 7=1 



2. Picking a new twig, which has the probability gf^(l). 

3. Generating riiU2 old twigs, which has the probability is 

rt2 ni / 712 

nri o) (i ; 70 n U°V) nrfV, v) 

y=l a '=2 \ 7 '=1 

The probability of the whole event, (o — * n; {0i (a)}, {0 2 ( a > i)}> i^i^i ')}^ {02°^( a '; 7' 
is the product of these three probabilities. Multiplying the three terms together, we obtain 

ni I n 2 \ n 2 ni / n 2 \ 

n M n) («) nrf n) («, 7) x ^(i) x n ^a, y) n U°V) n ?4V, y) (69) 

a=l \ 7=1 / 7 '=1 a'=2 \ j'=l J 

Similarly, the probability of proposing the reverse move, 

a 'i 



(n^o; {0!V)M02°V, f)}, {#(«)},{# (a, 7)}), is 



ni / n 2 \ ri2 "d / "2 \ 

n U°V) n ^°v, y) x g W(i) x n^ n) (i, 7) n n^fa 7) .(70) 

a'=l \ 7'=1 / 7=1 a=2 \ 7=1 / 

We define U'^ xt as the external energy for unit 1 in the old configuration and U° xt as 
the external energy for unit 1 in the new configuration. Taking the ratio of eq. (JoTJj) an d 
eq. (J7UJ), most of the probabilities for generating the twigs cancel. Replacing (ft (1) and 
(/i° ; (l) with eq. (J39j) . we obtain 

q^o-m; {^\a)},{^\a, 7)}, M°V)}, {0 2 °V, Y)}) 
^(n-o; {0S O) («')},{4 O) K Y)}, {<M n) («)},{#(«, 7)}) 

P S n) (l)exp[-/3Ur' n (l)]w( n) (l)wi o) 



p(°)(l)exp[-/3Ur' (l)] w ( o) (l)wi n) 
_ cxp(-/3Ul n) )w^ n) (l)wS o) 
~ exp(-M 0) )w2 0) (l)wi n) ' 



(71) 



where we have used eq. (J35|) to obtain the last line. Similarly, we can obtain the ratio of 
probabilities for subsequent units. The ratio of the transition probabilities is the product 
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of these ratios and the ratios of the acceptance probabilities. Multiplying eq. (|7T|) for each 
unit and using eqs. (|4*3J) and (JUJ), we find that super detailed balance is satisfied: 



q(o -» n)acc(o -» n) njlx exp(-/?uf } ) ^ Ilf =2 wj n) (l) nti^ W"' 
a(n - o)acc(n - o) exp(-/5Uf ) ) X n? =2 w?(l)n? =1 w? W<o) 

(72) 



exp(-/?U< n >) 
exp(-/3U(°)) ' 
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Table 1: Constraint equations and target functions. In case 6, X can stand for either A or 
B, and 4>' 3 is denned by eqs. (IT2j) and (fT^|) . 



Case Units(l-5) 
1 AABAB 


Geometrical Constraints Classification Comments 
r 4 W>6)-r 2 (0i)| 2 -Z 2 / = O dist 6 =/(0i) 

&3(01,02) -U 6 -COS04 = dotl 2 =/(0 1 ) 

r 4 [0 6 (0i)] - r 3h [0 2 (0i)]| 2 - ^3h,4 2 target =/(0i) 


2 AAAAB 


r 4 (0 6 ) - r 2 (0i) 
rst - r 3 (0i,0 2 ) 
r 4 [0 6 (0i)] - r 3 


2 -/ 2 / = dist 06=/(0l) 

• u 6 - hA cos l9 4 - Z 4i5h - Z 5h ,5t cos 5 t = dot 2 =/(0 1 ) 
[02(0i)H 2 -^3,4 2 = O target 0-/(00 


3 BABAB 


r 4 (0 6 ) - r 2 (0i) 

&3(0 2 ) • U6 " COS 

r 4 [0 6 (0i)] - r 3t 


2 -« 2 / = dist 06=/(0l) 

t9 4 = quad determine 2 
(0i,0 2 )| 2 -/ 3 h.4 2 = O target =/(0 4 ) 


4 BABAA 


r 4 (0 6 )-r 2 (0i)| 2 -Z 2 / = O dist 0i=/(0e) 
d 4 (06,05) ' Ui - cos6> 2 = dotl 5 =/(0 6 ) 
r 2 [0x(0 6 )l-r 3 t [05(06)1 1 2 -Z 2 ,3t 2 = target =/(0 6 ) 


5 BAAAA 


r 4 (0e) -r 2 (0i)| 2 -Z 2 ,4 =0 dist 0i=/(0 6 ) 
i"ih - r3(0 6 ,0 5 )] • ui + Z 2i3 cosi9 2 + /i hj it cos^ih + Zit,2h = dot 5 =/(0 6 ) 

r 2 [0i(06)]-r3[0 5 (06)]| 2 -^,3 2 = O target =/(0 6 ) 


6 AXAXA 


r3(0 3 )-r 2h (0i)| 2 -Z 2 h,3 Z = O dist 0i=/(0 3 ) 
r 3 (0 3 )-r 4t (0 6 )| 2 -Z 3 , 4t 2 = O dist 6 =/(0 3 ) 
rat - r 4h [^ 6 (^),^]| 2 - ; 2 t,4h 2 = target =/(0 3 ) 
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Table 2: The mathematical form of the constraint functions. Here i, j, and k are labels to 
units, and d is a constant that varies from case to case. The reference units of i and j are 
i' and j', respectively. The reference unit of j' is j". The labels a and b are either h or t, 
depending on whether the notation is forward or backward. We define a* = h when a = t, 
and a* = t when a = h. The torsional variables that appear in the constraint equations are 
4>i and 011. The constant characteristic matrix is M. The last row defines notation used in 
this table. Notation not defined here is defined in Sec. I 



Type 


General function form 


M (used in *iM* n = ) 


quad 


Ui{(fn) ■ u? - d = 


not applicable 


dist 


|r w (</>i) - rjbOn)! 2 - d = 


(|l*l'o — Yj'b 2 + h'a,ia 2 + h'bjb 2 ~ d)G 

~ 2L a T T^ b T T'v b L j / 6 + 2W Q T T^ bT r(iy a - iy b ) 
-2[r(r l , a -r^ b )] T T^ b L J v 6 


dot 


[r w - r jb (<f)i, (j) n )} ■ u fe - d = 


[(r M - r r , b ) • Qfc - d] C - V 6 T T^, b ' r(u fc ) 
-Gj»(u fc )T.„ e L J / 6 


dotl 


Uj (0i, 0n) • iife - d = 


-dC + Gjv/ffifcJT^Aj-, 



c 



Gi(x) 



Li' a 



1 





r(x) 





Xy 

x 2 



V 










1 "Trplab 



T lab T x 

T' abT x 







TV" x 



I rr(x)] 1 t' ; 



, if 0< ^ 

, if e % = o 



li'h.it ~\~li'a*,ia ~t~ ^^i / h,t / t^* / a*,ia COS ( 



Table 3: Comparison of simulation results with different rebridging methods at 298 K. For 



all simulations A0 max = 10°, except for WJM, in which A0 max = 30°. 

Method A0 avg (deg) P acc -P CT oss Number of steps CPU time (hrs) 

NJ MM 0.162 0.000303 4 x 10 a 32 

WJ 7.291 0.172 0.000414 4 x 10 5 32 

WJO 10.006 0.177 0.000167 4 x 10 5 34 

WJM 8.525 0.111 0.000468 2 x 10 5 40 

MT 2.699 0.051 0.000222 4 x 10 5 30 



Table 4: Acceptance probability observed for swapping moves in the parallel tempering 
simulation. 

Swap -P acc 

298 K <-> 500 K 0.147 
500 K 1000 K 0.113 
1000 K 3000 K 0.136 
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Table 5: Simulation data for tryptophan, lysine, and arginine residues of 
CNWKRGDC peptide. 



Method 


n\ xn 2 


Residue 


A0/CPU (deg-min- 1 ) 


p 

1 acc 


CPU (min) 


SLA 


1x0 


Trp 


142 


0.0317 


2940 






Lys 


82 


0.00229 








Arg 


90 


0.00150 




SLA 


10x0 


Trp 


158 


0.0846 


1620 






Lys 


575 


0.187 








Arg 


152 


0.0325 




SLA 


30x0 


Trp 


315 


0.313 


3910 






Lys 


526 


0.214 








Arg 


290 


0.131 




SLA 


100x0 


Trp 


153 


0.373 


3220 






Lys 


445 


0.304 








Arg 


130 


0.0821 




LA 


5x5 


Trp 


131 


0.0660 


2040 






Lys 


271 


0.0947 








Arg 


157 


0.0529 




LA 


10x5 


Trp 


147 


0.0686 


2260 






Lys 


375 


0.169 








Arg 


205 


0.0792 




LA 


10x10 


Trp 


120 


0.0841 


2800 






Lys 


397 


0.220 








Arg 


225 


0.118 




LA 


20x10 


Trp 


28 


0.120 


2500 






Lys 


355 


0.348 








Arg 


55 


0.0811 




LARC 


5x5 


Trp 


213 


0.116 


1690 






Lys 


673 


0.221 








Arg 


327 


0.100 




LARC 


10x10 


Trp 


211 


0.206 


2560 






Lys 


749 


0.421 








Arg 


248 


0.146 




LARC 


15x15 


Trp 


151 


0.334 


4250 






Lys 


548 


0.497 








Arg 


346 


0.357 




LARC 


20x20 


Trp 


104 


0.415 


6420 






Lys 


394 


0.571 








Arg 


212 


0.350 
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Figure 1: A typical cyclic peptide, CNWKRGDC. The disulfide bond is shown at the 
left, spanned by two (spherical) sulfur atoms. Note that this is not the lowest free energy 
conformation of this molecule. 

Figure 2: Sketches for units of class A and B. The angle 6 is zero for class B. Only backbone 
atoms are depicted here. Bold lines indicates bonds that do not rotate. 

Figure 3: A segment selected to be rebridged. Change of driver angles O and 07 breaks the 
connectivity. The dotted area represents the region in which the positions of the backbone 
atoms must be restored. 

Figure 4: The rebridging method applied to an ABABA segment. It can be seen that 
| i"i — r 3 | and |r 3 — r 5 | are constants. The dotted area represents the region in which the 
positions of backbone atoms are to be restored. 

Figure 5: A typical target function. Only the 3 domain where the target function exists 
is shown here. The number of branches is four. 

Figure 6: Shown as solid is the definition of atom groups used for side chain regrowth: a) 
in non-look-ahead and b) in semi-look-ahead. The left two bonds are connected to other 
rigid units. 

Figure 7: Schematic pictures for generating trial moves. The solid circle denotes an 
existing unit. Dotted circles represent trial configurations of the next unit, a) CBMC 
without look-ahead generates and regrows one unit at one time. Configuration 2 has the 
lowest energy and is most likely to be picked, b) In look-ahead, we generate two units 
and regrow one unit. The configurations generated from 2 turn out to be disfavored, and 
configuration 4 is chosen instead. 

Figure 8: The probability distribution for the C^SSCg disulfide torsional angle observed in 
NJ, WJ, WJO, WJM, and MT. 

Figure 9: The energy histograms for all the systems in the parallel tempering simulation. 

Figure 10: The histograms for the C3SSC/3 disulfide torsional angle observed in the parallel 
tempering simulation. The total number of Monte Carlo cycles is N. The distribution at 
3000 K is shown for comparison. 



Figure 11: Equilibration of the side chains of CNWKRGDC. The numbers at the upper 
right are the values of n\. 
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Figure H Wu and Deem, 'Efficient Monte Carlo. . 
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Figure EJ Wu and Deem, 'Efficient Monte Carlo. 
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Figure El Wu and Deem, 'Efficient Monte Carlo. . . ' 
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Figure [3 Wu and Deem, 'Efficient Monte Carlo. . . ' 
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Figure IHJ Wu and Deem, 'Efficient Monte Carlo. 
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Figure 03 Wu and Deem, 'Efficient Monte Carlo. . . ' 
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Figure Wu and Deem, 'Efficient Monte Carlo. 
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Figure ITT1 and Deem, 'Efficient Monte Carlo. 
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